{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6b3ba1cc",
   "metadata": {},
   "source": [
    "# Tweet Classifier\n",
    "\n",
    "This example will show you how to determine sentiment of tweets, the original example is on [OpenAI Example](https://platform.openai.com/examples/default-tweet-classifier), the difference is that we will teach you how to cache the response for exact and similar matches with **gptcache**, it will be very simple, you just need to add an extra step to initialize the cache.\n",
    "\n",
    "Before running the example, make sure the `OPENAI_API_KEY` environment variable is set by executing `echo $OPENAI_API_KEY`. If it is not already set, it can be set by using `export OPENAI_API_KEY=YOUR_API_KEY` on Unix/Linux/MacOS systems or `set OPENAI_API_KEY=YOUR_API_KEY` on Windows systems.\n",
    "\n",
    "Then we can learn the usage and acceleration effect of gptcache by the following code, which consists of three parts, the original openai way, the exact search and the similar search."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aa0ba70e",
   "metadata": {},
   "source": [
    "## OpenAI API original usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "80e9dae2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Tweet: I loved the new Batman movie!\n",
      "Time consuming: 0.81s\n",
      "Sentiment: Positive\n",
      "\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "import openai\n",
    "\n",
    "\n",
    "def response_text(openai_resp):\n",
    "    return openai_resp['choices'][0]['message']['content']\n",
    "\n",
    "tweet = \"I loved the new Batman movie!\"\n",
    "\n",
    "# OpenAI API original usage\n",
    "start_time = time.time()\n",
    "response = openai.ChatCompletion.create(\n",
    "  model='gpt-3.5-turbo',\n",
    "  messages=[\n",
    "    {\n",
    "        'role': 'user',\n",
    "        'content': f\"Decide whether a Tweet's sentiment is positive, neutral, or negative.\\n\\nTweet: \\\"{tweet}\\\"\\nSentiment:\",\n",
    "    }\n",
    "  ],\n",
    ")\n",
    "print(f'Tweet: {tweet}')\n",
    "print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))\n",
    "print(f'Sentiment: {response_text(response)}\\n')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d871550",
   "metadata": {},
   "source": [
    "## OpenAI API + GPTCache, exact match cache\n",
    "\n",
    "Initalize the cache to run GPTCache and import `openai` form `gptcache.adapter`, which will automatically set the map data manager to match the exact cahe, more details refer to [build your cache](https://gptcache.readthedocs.io/en/dev/usage.html#build-your-cache).\n",
    "\n",
    "And if you send the exact same tweets, the answer to the second tweet will be obtained from the cache without requesting ChatGPT again."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "024484f3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Cache loading.....\n",
      "Tweet: The weather today is neither good nor bad\n",
      "Time consuming: 0.62s\n",
      "Sentiment: neutral\n",
      "\n",
      "Tweet: The weather today is neither good nor bad\n",
      "Time consuming: 0.00s\n",
      "Sentiment: neutral\n",
      "\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "def response_text(openai_resp):\n",
    "    return openai_resp['choices'][0]['message']['content']\n",
    "\n",
    "print(\"Cache loading.....\")\n",
    "\n",
    "# To use GPTCache, that's all you need\n",
    "# -------------------------------------------------\n",
    "from gptcache import cache\n",
    "from gptcache.adapter import openai\n",
    "\n",
    "cache.init()\n",
    "cache.set_openai_key()\n",
    "# -------------------------------------------------\n",
    "\n",
    "tweet = \"The weather today is neither good nor bad\"\n",
    "for _ in range(2):\n",
    "    start_time = time.time()\n",
    "    response = openai.ChatCompletion.create(\n",
    "      model='gpt-3.5-turbo',\n",
    "      messages=[\n",
    "        {\n",
    "            'role': 'user',\n",
    "            'content': f\"Decide whether a Tweet's sentiment is positive, neutral, or negative.\\n\\nTweet: \\\"{tweet}\\\"\\nSentiment:\",\n",
    "        }\n",
    "      ],\n",
    "    )\n",
    "    print(f'Tweet: {tweet}')\n",
    "    print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))\n",
    "    print(f'Sentiment: {response_text(response)}\\n')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f2ff699",
   "metadata": {},
   "source": [
    "## OpenAI API + GPTCache, similar search cache\n",
    "\n",
    "We are going to use [DocArray's in-memory](https://docs.docarray.org/user_guide/storing/index_in_memory/) index to perform similarity search.\n",
    "\n",
    "Set the cache with `embedding_func` to generate embedding for the text, and `data_manager` to manager the cache data, `similarity_evaluation` to evaluate the similarities, more details refer to [build your cache](https://gptcache.readthedocs.io/en/dev/usage.html#build-your-cache).\n",
    "\n",
    "After obtaining an answer from ChatGPT in response to several similar tweets, the answers to subsequent questions can be retrieved from the cache without the need to request ChatGPT again."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "fd1ff06e",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/jinaai/Desktop/GPTCache/venv1/lib/python3.8/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n",
      "None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Cache loading.....\n",
      "Tweet: The new restaurant in town exceeded my expectations with its delectable cuisine and impeccable service\n",
      "Time consuming: 0.70s\n",
      "Sentiment:  Positive\n",
      "\n",
      "Tweet: New restaurant in town exceeded my expectations with its delectable cuisine and impeccable service\n",
      "Time consuming: 0.59s\n",
      "Sentiment:  Positive\n",
      "\n",
      "Tweet: The new restaurant exceeded my expectations with its delectable cuisine and impeccable service\n",
      "Time consuming: 0.74s\n",
      "Sentiment:  Positive\n",
      "\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "\n",
    "def response_text(openai_resp):\n",
    "    return openai_resp['choices'][0]['message']['content']\n",
    "\n",
    "from gptcache import cache\n",
    "from gptcache.adapter import openai\n",
    "from gptcache.embedding import Onnx\n",
    "from gptcache.manager import CacheBase, VectorBase, get_data_manager\n",
    "from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation\n",
    "\n",
    "print(\"Cache loading.....\")\n",
    "\n",
    "onnx = Onnx()\n",
    "data_manager = get_data_manager(CacheBase(\"sqlite\"), VectorBase(\"docarray\"))\n",
    "cache.init(\n",
    "    embedding_func=onnx.to_embeddings,\n",
    "    data_manager=data_manager,\n",
    "    similarity_evaluation=SearchDistanceEvaluation(),\n",
    "    )\n",
    "cache.set_openai_key()\n",
    "\n",
    "tweets = [\n",
    "    \"The new restaurant in town exceeded my expectations with its delectable cuisine and impeccable service\",\n",
    "    \"New restaurant in town exceeded my expectations with its delectable cuisine and impeccable service\",\n",
    "    \"The new restaurant exceeded my expectations with its delectable cuisine and impeccable service\",\n",
    "]\n",
    "\n",
    "for tweet in tweets:\n",
    "    start_time = time.time()\n",
    "    response = openai.ChatCompletion.create(\n",
    "        model='gpt-3.5-turbo',\n",
    "        messages=[\n",
    "            {\n",
    "                'role': 'user',\n",
    "                'content': f\"Decide whether a Tweet's sentiment is positive, neutral, or negative.\\n\\nTweet: \\\"{tweet}\\\"\\nSentiment:\",\n",
    "            }\n",
    "        ],\n",
    "    )\n",
    "    print(f'Tweet: {tweet}')\n",
    "    print(\"Time consuming: {:.2f}s\".format(time.time() - start_time))\n",
    "    print(f'Sentiment: {response_text(response)}\\n')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
